Public Mass Shootings

Since 2009, there have been 278 mass shootings in the United States, resulting in 1569 people shot and killed and 1000 people shot and wounded. The purpose of this analysis is to explore correlations between public mass shootings versus unemployment and household median income. This could help public officials deploy more effective public policies based on observations seen in the data set. Hopefully helping address future incidents from happening.

Datasets

https://www.motherjones.com/politics/2012/12/mass-shootings-mother-jones-full-data/

https://www.icip.iastate.edu/tables/employment/unemployment-states

https://geodacenter.github.io/data-and-lab//houston/

I would like to develop this hypothesis to enable the United States Government to better strategize and improve the understanding of public mass shooting incidents.

This study should enable the US government to address trends of shooter background to target social policy more effectively.

I will also compare the incidents by Region Unemployment Rate of State

Shooting Data:

##      case             location             date             summary         
##  Length:126         Length:126         Length:126         Length:126        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##    fatalities        injured       total_victims     location.1       
##  Min.   : 3.000   Min.   :  0.00   Min.   :  3.00   Length:126        
##  1st Qu.: 4.000   1st Qu.:  1.00   1st Qu.:  6.25   Class :character  
##  Median : 6.000   Median :  3.00   Median : 10.00   Mode  :character  
##  Mean   : 7.968   Mean   : 11.58   Mean   : 19.55                     
##  3rd Qu.: 8.750   3rd Qu.: 10.00   3rd Qu.: 17.00                     
##  Max.   :58.000   Max.   :546.00   Max.   :604.00                     
##  age_of_shooter     prior_signs_mental_health_issues mental_health_details
##  Length:126         Length:126                       Length:126           
##  Class :character   Class :character                 Class :character     
##  Mode  :character   Mode  :character                 Mode  :character     
##                                                                           
##                                                                           
##                                                                           
##  weapons_obtained_legally where_obtained     weapon_type       
##  Length:126               Length:126         Length:126        
##  Class :character         Class :character   Class :character  
##  Mode  :character         Mode  :character   Mode  :character  
##                                                                
##                                                                
##                                                                
##  weapon_details         Race              Gender            sources         
##  Length:126         Length:126         Length:126         Length:126        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##  mental_health_sources sources_additional_age    latitude       longitude      
##  Length:126            Length:126             Min.   :21.32   Min.   :-157.88  
##  Class :character      Class :character       1st Qu.:33.80   1st Qu.:-117.56  
##  Mode  :character      Mode  :character       Median :38.32   Median : -90.87  
##                                               Mean   :37.38   Mean   : -96.84  
##                                               3rd Qu.:41.47   3rd Qu.: -81.51  
##                                               Max.   :48.46   Max.   : -71.08  
##      type                year     
##  Length:126         Min.   :1982  
##  Class :character   1st Qu.:2000  
##  Mode  :character   Median :2013  
##                     Mean   :2009  
##                     3rd Qu.:2017  
##                     Max.   :2022

Exploring Data Analysis:

The chart below shows an increase in fatality over the last few decades.

The Boxplot also shows us an increase in fatalities in recent years and an average fatality rate of 5-10 deceased in a shooting.
Summary Statistics
Variable N Mean Std. Dev. Min Pctl. 25 Pctl. 75 Max
fatalities 126 7.968 7.721 3 4 8.75 58
injured 126 11.579 49.1 0 1 10 546
total_victims 126 19.548 54.514 3 6.25 17 604
Race 126
… Asian 8 6.3%
… Black 21 16.7%
… Latino 10 7.9%
… Native American 3 2.4%
… Unidentified 17 13.5%
… White 67 53.2%
Gender 126
… F 5 4%
… M 121 96%
latitude 126 37.384 5.623 21.32 33.8 41.468 48.461
longitude 126 -96.842 17.914 -157.876 -117.556 -81.508 -71.076
type 126
… Mass 107 84.9%
… Spree 19 15.1%
year 126 2008.968 10.563 1982 2000.25 2017 2022

Mosaic Display

(Exploratory Analysis of Dataset)

Fig. State Unemployment vs State Fatalities X-squared = 5.1174, df = 1, p-value = 0.02369 Fig. Mental Health vs State Fatalities: X-squared = 0.00043219, df = 1, p-value = 0.9834 Fig. Weapon Legality vs State Fatalities: X-squared = 0.50809, df = 1, p-value = 0.476 Fig. Gender vs Race: X-squared = 10.855, df = 5, p-value = 0.05433

Bivariate Analysis: Smoothing.

## `geom_smooth()` using formula 'y ~ x'

##   [1] 38.83755 39.68663 34.11165 31.92597 39.75731 31.77107 40.78514 27.47104
##   [9] 40.44390 38.99455 29.39282 36.05252 26.30483 40.05215 40.01876 29.27328
##  [17] 39.87637 36.09574 41.52955 39.95903 42.23669 38.88103 32.78839 34.43628
##  [25] 34.00862 43.04560 41.41232 44.97742 42.88585 39.70604 47.60383 33.74118
##  [33] 32.33594 35.33343 37.76721 41.92947 41.26572 45.57191 40.76065 39.95890
##  [41] 47.62290 34.42557 43.06057 39.96226 32.41084 42.50043 32.66451 33.85012
##  [49] 39.60403 44.04624 35.82099 41.68563 47.61864 35.05299 37.79297 42.38106
##  [57] 39.07869 42.48948 31.11712 38.25424 37.95770 37.36883 28.03319 35.66720
##  [65] 32.55200 32.92517 25.79649 38.60111 42.84411 37.31610 33.83542 39.98696
##  [73] 37.21043 30.36471 36.99719 34.17695 28.58029 48.46137 28.51972 34.07596
##  [81] 43.28954 35.04716 31.13556 27.82803 48.05082 41.48710 47.87635 41.84767
##  [89] 35.34939 39.10198 26.07275 44.20412 31.14172 25.86434 39.16380 27.96648
##  [97] 33.78779 43.04451 40.70736 36.75442 41.75373 39.45566 39.45219 36.74638
## [105] 30.43360 32.78011 38.13599 38.87498 47.31296 41.79876 47.15277 38.58009
## [113] 41.90816 33.55986 26.11927 39.67560 40.72677 30.33218 38.39250 37.76595
## [121] 37.80438 33.94121 42.09980 37.22957 21.32006 41.66069
##   [1] -104.81425  -86.32313  -84.58038 -102.27960  -84.18495 -106.37565
##   [7]  -77.83941  -81.45847  -79.92140  -76.54366  -95.14197  -86.61694
##  [13]  -80.26951  -79.38917 -122.39309  -98.05649 -104.98613 -115.17154
##  [19]  -75.94722  -82.59651  -85.67480 -104.84906  -79.93314 -119.87144
##  [25] -118.49475  -74.98489  -73.31142  -93.31041  -87.86314 -104.82059
##  [31] -122.33006 -118.10464 -110.97513  -79.41459  -87.55737  -88.75036
##  [37]  -96.06749  -88.90289 -111.89109  -76.08060 -122.31650 -119.86607
##  [43]  -88.10648  -83.00071  -88.63454  -71.07591  -97.38425  -84.37784
##  [49] -105.07410 -123.02203  -90.66826  -72.72984 -117.64836  -78.87871
##  [55] -122.39797  -76.87058 -121.54758  -83.14465  -97.72780  -85.75941
##  [61] -121.29078 -122.03635  -80.64297  -97.42937 -117.04308  -96.83868
##  [67]  -80.22668 -121.41897  -83.25993 -121.88853 -117.85379 -105.25117
##  [73]  -93.23686  -87.28857 -121.58482 -118.87479  -81.29409 -122.33792
##  [79]  -81.37678 -117.27789 -123.33319  -85.31182  -97.78366  -97.54820
##  [85] -122.17692 -120.54224  -95.01694  -87.62201 -118.91634  -84.51178
##  [91]  -80.14338  -88.46754  -97.77756  -80.31177 -119.76740  -82.57059
##  [97] -117.85311  -87.96254  -74.08361  -76.06038  -88.33106  -76.20848
## [103]  -76.30999 -119.80032  -91.08140  -96.80001  -97.42515  -76.99453
## [109] -122.33937  -72.57007 -122.46731  -90.40691  -87.87991  -81.72195
## [115]  -80.10412 -104.84485  -73.63430  -81.65565 -122.36653 -122.40609
## [121] -122.27082  -84.21353  -75.91772  -80.41394 -157.87646  -91.53022

Visualizing Clustering of Incidents

In order to visualize the clustering of incidents we join our data set by region to the map of the US. As we can see from the spatial mapping below the highest incidents of shootings have taken place in Nevada and Texas.

On observing the point clustering there seemed to visually be a few zones/ states in which the clustering of shootings was high. I wanted to see visually if there was a relationship between the shootings and demographic factors such as unemployment.

Joining shooting data with base map data:

First visualization of shooter data on base map plot:

##Visualizing clustering of incidents # Point Plot of shooting location:

##Visualizing clustering of incidents # Merging point plot with base map with labelling:

## Warning: Removed 1 rows containing missing values (geom_point).
## Warning: Removed 1 rows containing missing values (geom_text_repel).
## Warning: ggrepel: 91 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps

Combining Fatality gradient with shooter locations:

## Warning: ggrepel: 125 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps
## Warning: ggrepel: 73 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps

Combining unemployment base map with shooter locations:

Below we see a map with a gradient of unemployment levels within states and the location of the shootings. To the eye it seems like there was a relationship between shootings and the unemployment levels.

It is observed that the lighter states (high unemployment) is where most of the clusters lie as compared to the darker states (low unemployment)

## Warning: ggrepel: 89 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps

Variogram

The Variogram Analysis for the dataset unfortunately didn’t show a trend of spatial correlation. Leading to the selection of another dataset for further observation. The dataset selected shows the Homocide rates in the states with one of the highest Mass Shooting Fatality rates- Texas.

##     np      dist    gamma dir.hor dir.ver   id
## 1  248  1.029397 37.59879       0       0 var1
## 2  312  3.058207 68.24840       0       0 var1
## 3  394  5.045395 50.23096       0       0 var1
## 4  412  7.069526 78.65170       0       0 var1
## 5  470  9.108588 40.17872       0       0 var1
## 6  495 11.127538 73.86162       0       0 var1
## 7  529 13.203104 51.17108       0       0 var1
## 8  524 15.068132 62.92653       0       0 var1
## 9  525 17.183127 58.52952       0       0 var1
## 10 308 19.099637 58.01948       0       0 var1
## 11 278 21.183013 48.36871       0       0 var1
## 12 201 23.165882 57.58209       0       0 var1
## 13 290 25.313153 85.93621       0       0 var1
## 14 253 27.308062 88.89723       0       0 var1
## 15 251 29.197187 59.01594       0       0 var1

## Warning in fit.variogram(Vario_shootings, vgm(0.013, "Exp", 100, 1)): No
## convergence after 200 iterations: try different initial values?

No Real Convergence Observed

Moran I

Moran I (Fatalities)

For the first variable we have the Moran’s I coefficient at -0.02075940. While -1 is clustering of dissimilar values or perfect dispersion and 0 is no autocorrelation or perfect randomness. This tells us that the fatality in a shooting has little no clustering with their neighbors. This makes sense as the fatality of one shooting to another does not necessarily have a geographical correlation to other shootings.

Moran I (Injured)

For the second variable we have the Moran’s I coefficient at 0.003926306 . While 1 is perfect clustering of similar values or perfect dispersion and 0 is no autocorrelation or perfect randomness. This tells us that the injured in a shooting is still pretty random and has no consequence or correlation to what happens in a closest neighbor.

Moran I (Age of Shooter)

For the third variable we have the Moran’s I coefficient at 0.08664016 While 1 is perfect clustering of similar values and 0 is no autocorrelation or perfect randomness. This tells us that the age of a shooter is still pretty random but has slightly more correlation as compared fatality and injuries of a shooting. The coefficient is not conclusive but begins to wonder if there are any demographic influences causing people of a certain age in a region to act a certain way. To know more we would need a better dataset.

Analysis

The coefficients indicate that the US shooting incidences show low autocorrelation to one another. It could be because of the premise of the data set is mass or spree shootings which implies more than 4 fatalities. It would be interesting to combine fatalities data with homicide rate data to be able to see a larger data bank of crime activity. For the moment our data suggests spatial independence.

## [1] FALSE
## 
##  Moran I test under randomisation
## 
## data:  SHOOTINGS$fatalities  
## weights: nb2listw(SHOOTINGS_nb, style = "W", zero.policy = TRUE)  n reduced by no-neighbour observations
##   
## 
## Moran I statistic standard deviate = 0.019393, p-value = 0.4923
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic       Expectation          Variance 
##       -0.02075940       -0.02325581        0.01657035

## [1] TRUE
## 
##  Moran I test under randomisation
## 
## data:  SHOOTINGS$injured  
## weights: nb2listw(SHOOTINGS_nb, style = "W", zero.policy = TRUE)  n reduced by no-neighbour observations
##   
## 
## Moran I statistic standard deviate = NaN, p-value = NA
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic       Expectation          Variance 
##       0.003926306      -0.023255814      -0.059827141

##  [1] 39.68663 39.98696 41.75373 27.47104 34.17695 39.10198 36.05252 38.39250
##  [9] 26.30483 37.76595 26.07275 44.20412 34.43628 38.87498 34.00862 47.31296
## [17] 43.04560 41.41232 44.97742 42.88585 39.70604 47.60383 37.80438 33.94121
## [25] 33.74118 39.16380 32.33594 41.79876 47.15277 42.09980 35.33343 37.76721
## [33] 41.92947 38.58009 41.26572 37.22957 39.95890 47.62290 34.42557 47.87635
## [41] 43.06057 39.96226 32.41084 41.90816 42.50043 27.96648 21.32006 32.66451
## [49] 33.85012 39.60403 44.04624 35.82099 41.68563 33.78779 33.55986 26.11927
## [57] 27.82803 47.61864 40.72677 35.05299 37.79297 42.38106 39.07869 42.48948
## [65] 31.11712 30.33218 38.25424 37.95770 37.36883 28.03319 32.55200 32.92517
## [73] 25.79649
##  [1]  -86.32313 -105.25117  -88.33106  -81.45847 -118.87479  -84.51178
##  [7]  -86.61694 -122.36653  -80.26951 -122.40609  -80.14338  -88.46754
## [13] -119.87144  -76.99453 -118.49475 -122.33937  -74.98489  -73.31142
## [19]  -93.31041  -87.86314 -104.82059 -122.33006 -122.27082  -84.21353
## [25] -118.10464 -119.76740 -110.97513  -72.57007 -122.46731  -75.91772
## [31]  -79.41459  -87.55737  -88.75036  -90.40691  -96.06749  -80.41394
## [37]  -76.08060 -122.31650 -119.86607  -95.01694  -88.10648  -83.00071
## [43]  -88.63454  -87.87991  -71.07591  -82.57059 -157.87646  -97.38425
## [49]  -84.37784 -105.07410 -123.02203  -90.66826  -72.72984 -117.85311
## [55]  -81.72195  -80.10412  -97.54820 -117.64836  -73.63430  -78.87871
## [61] -122.39797  -76.87058 -121.54758  -83.14465  -97.72780  -81.65565
## [67]  -85.75941 -121.29078 -122.03635  -80.64297 -117.04308  -96.83868
## [73]  -80.22668
## [1] TRUE
## 
##  Moran I test under randomisation
## 
## data:  data_shootings2$age_of_shooter  
## weights: nb2listw(data_shootings2_nb, style = "W", zero.policy = TRUE)  n reduced by no-neighbour observations
##   
## 
## Moran I statistic standard deviate = 0.49111, p-value = 0.3117
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic       Expectation          Variance 
##        0.08664016       -0.04761905        0.07473523

Semivariograms

An omnidirectional semivariogram is computed and modelled to explore the overall spatial continuity of the dataset. With increasing spatial separation ( h ), the correlation between two data points decreases. The range is the distance at which two data points become independent.

Semivariograms (Fatalities)

A general trend of increased or decreased semivariance was not observed. This would be due to the fact of randomness in the shooting data set or spatial independence.

Semivariograms (Injured)

A slight trend of increased semivariance was observed initially but the eventual decrease again suggests spatial independence. This would be due to the fact of randomness in the shooting data set.

Semivariograms (Age of Shooter)

Overall the semivariance across distance fluctuated between a semivariance range of 125 - 175.

Analysis

Overall we noticed similar results to Morans I where there was no real observed trend of of semivariance with respect to distance.

##  [1] 39.68663 39.98696 41.75373 27.47104 34.17695 39.10198 36.05252 38.39250
##  [9] 26.30483 37.76595 26.07275 44.20412 34.43628 38.87498 34.00862 47.31296
## [17] 43.04560 41.41232 44.97742 42.88585 39.70604 47.60383 37.80438 33.94121
## [25] 33.74118 39.16380 32.33594 41.79876 47.15277 42.09980 35.33343 37.76721
## [33] 41.92947 38.58009 41.26572 37.22957 39.95890 47.62290 34.42557 47.87635
## [41] 43.06057 39.96226 32.41084 41.90816 42.50043 27.96648 21.32006 32.66451
## [49] 33.85012 39.60403 44.04624 35.82099 41.68563 33.78779 33.55986 26.11927
## [57] 27.82803 47.61864 40.72677 35.05299 37.79297 42.38106 39.07869 42.48948
## [65] 31.11712 30.33218 38.25424 37.95770 37.36883 28.03319 32.55200 32.92517
## [73] 25.79649
##  [1]  -86.32313 -105.25117  -88.33106  -81.45847 -118.87479  -84.51178
##  [7]  -86.61694 -122.36653  -80.26951 -122.40609  -80.14338  -88.46754
## [13] -119.87144  -76.99453 -118.49475 -122.33937  -74.98489  -73.31142
## [19]  -93.31041  -87.86314 -104.82059 -122.33006 -122.27082  -84.21353
## [25] -118.10464 -119.76740 -110.97513  -72.57007 -122.46731  -75.91772
## [31]  -79.41459  -87.55737  -88.75036  -90.40691  -96.06749  -80.41394
## [37]  -76.08060 -122.31650 -119.86607  -95.01694  -88.10648  -83.00071
## [43]  -88.63454  -87.87991  -71.07591  -82.57059 -157.87646  -97.38425
## [49]  -84.37784 -105.07410 -123.02203  -90.66826  -72.72984 -117.85311
## [55]  -81.72195  -80.10412  -97.54820 -117.64836  -73.63430  -78.87871
## [61] -122.39797  -76.87058 -121.54758  -83.14465  -97.72780  -81.65565
## [67]  -85.75941 -121.29078 -122.03635  -80.64297 -117.04308  -96.83868
## [73]  -80.22668

Spatial Modeling

library(rgdal)
spat.data = readOGR("NAT.shp")
## OGR data source with driver: ESRI Shapefile 
## Source: "/Users/samarthgwalani/Desktop/Harrisburg/Analysis-555/Finals/NAT.shp", layer: "NAT"
## with 3085 features
## It has 70 fields
## Integer64 fields read as strings:  STFIPS COFIPS FIPSNO
names(spat.data) #show variable names
##  [1] "LAG1"       "NAME"       "STATE_NAME" "STATE_FIPS" "CNTY_FIPS" 
##  [6] "FIPS"       "STFIPS"     "COFIPS"     "FIPSNO"     "SOUTH"     
## [11] "HR60"       "HR70"       "HR80"       "HR90"       "HC60"      
## [16] "HC70"       "HC80"       "HC90"       "PO60"       "PO70"      
## [21] "PO80"       "PO90"       "RD60"       "RD70"       "RD80"      
## [26] "RD90"       "PS60"       "PS70"       "PS80"       "PS90"      
## [31] "UE60"       "UE70"       "UE80"       "UE90"       "DV60"      
## [36] "DV70"       "DV80"       "DV90"       "MA60"       "MA70"      
## [41] "MA80"       "MA90"       "POL60"      "POL70"      "POL80"     
## [46] "POL90"      "DNL60"      "DNL70"      "DNL80"      "DNL90"     
## [51] "MFIL59"     "MFIL69"     "MFIL79"     "MFIL89"     "FP59"      
## [56] "FP69"       "FP79"       "FP89"       "BLK60"      "BLK70"     
## [61] "BLK80"      "BLK90"      "GI59"       "GI69"       "GI79"      
## [66] "GI89"       "FH60"       "FH70"       "FH80"       "FH90"
spat.data$HR70 <- as.numeric(as.character(spat.data$HR70))

spplot(spat.data,"HR70") #make map

library(spdep)

queen.nb=poly2nb(spat.data) 
rook.nb=poly2nb(spat.data,queen=FALSE) 

queen.listw=nb2listw(queen.nb) #convert nb to listw type
rook.listw=nb2listw(rook.nb) #convert nb to listw type
listw1= queen.listw

#define our regression equation so we don't have to type it each time
reg.eq1= HR70~HC70+FP69+GI69

OLS, SLX, Lag Y, and Lag Error

1 - OLS

reg1=lm(reg.eq1,data=spat.data)
summary(reg1)
## 
## Call:
## lm(formula = reg.eq1, data = spat.data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -33.660  -3.832  -1.268   2.463  68.496 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   6.242687   1.551440   4.024 5.86e-05 ***
## HC70          0.037539   0.003293  11.399  < 2e-16 ***
## FP69          0.374423   0.021129  17.721  < 2e-16 ***
## GI69        -17.530610   4.926354  -3.559 0.000379 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.574 on 3081 degrees of freedom
## Multiple R-squared:  0.2014, Adjusted R-squared:  0.2006 
## F-statistic: 258.9 on 3 and 3081 DF,  p-value: < 2.2e-16
lm.morantest(reg1,listw1)
## 
##  Global Moran I for regression residuals
## 
## data:  
## model: lm(formula = reg.eq1, data = spat.data)
## weights: listw1
## 
## Moran I statistic standard deviate = 26.968, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Observed Moran I      Expectation         Variance 
##     0.2891894281    -0.0006973051     0.0001155469
lm.LMtests(reg1,listw1,test=c("LMerr", "LMlag", "RLMerr", "RLMlag", "SARMA"))
## 
##  Lagrange multiplier diagnostics for spatial dependence
## 
## data:  
## model: lm(formula = reg.eq1, data = spat.data)
## weights: listw1
## 
## LMerr = 720.85, df = 1, p-value < 2.2e-16
## 
## 
##  Lagrange multiplier diagnostics for spatial dependence
## 
## data:  
## model: lm(formula = reg.eq1, data = spat.data)
## weights: listw1
## 
## LMlag = 834.12, df = 1, p-value < 2.2e-16
## 
## 
##  Lagrange multiplier diagnostics for spatial dependence
## 
## data:  
## model: lm(formula = reg.eq1, data = spat.data)
## weights: listw1
## 
## RLMerr = 0.018748, df = 1, p-value = 0.8911
## 
## 
##  Lagrange multiplier diagnostics for spatial dependence
## 
## data:  
## model: lm(formula = reg.eq1, data = spat.data)
## weights: listw1
## 
## RLMlag = 113.29, df = 1, p-value < 2.2e-16
## 
## 
##  Lagrange multiplier diagnostics for spatial dependence
## 
## data:  
## model: lm(formula = reg.eq1, data = spat.data)
## weights: listw1
## 
## SARMA = 834.14, df = 2, p-value < 2.2e-16

2 - SLX or Spatially Lagged X

library(spatialreg)
## Warning: package 'spatialreg' was built under R version 4.1.2
## Loading required package: Matrix
## 
## Attaching package: 'Matrix'
## The following objects are masked from 'package:tidyr':
## 
##     expand, pack, unpack
## 
## Attaching package: 'spatialreg'
## The following objects are masked from 'package:spdep':
## 
##     get.ClusterOption, get.coresOption, get.mcOption,
##     get.VerboseOption, get.ZeroPolicyOption, set.ClusterOption,
##     set.coresOption, set.mcOption, set.VerboseOption,
##     set.ZeroPolicyOption
reg2.c=lmSLX(reg.eq1,data=spat.data, listw1)

summary(reg2.c)
## 
## Call:
## lm(formula = formula(paste("y ~ ", paste(colnames(x)[-1], collapse = "+"))), 
##     data = as.data.frame(x), weights = weights)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -30.306  -3.570  -1.259   2.247  68.109 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  13.640049   2.321424   5.876 4.66e-09 ***
## HC70          0.035179   0.003266  10.770  < 2e-16 ***
## FP69          0.151785   0.029623   5.124 3.18e-07 ***
## GI69         -0.997493   5.741752  -0.174    0.862    
## lag.HC70      0.027458   0.005699   4.818 1.52e-06 ***
## lag.FP69      0.394088   0.039838   9.892  < 2e-16 ***
## lag.GI69    -44.738152   8.670074  -5.160 2.63e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.442 on 3078 degrees of freedom
## Multiple R-squared:  0.2339, Adjusted R-squared:  0.2324 
## F-statistic: 156.6 on 6 and 3078 DF,  p-value: < 2.2e-16
impacts(reg2.c,listw=listw1)
## Impact measures (SlX, estimable):
##           Direct     Indirect       Total
## HC70  0.03517931   0.02745829   0.0626376
## FP69  0.15178502   0.39408780   0.5458728
## GI69 -0.99749277 -44.73815226 -45.7356450
summary(impacts(reg2.c,listw=listw1,R=500),zstats=TRUE) #Add zstats,pvals; R=500 not needed for SLX
## Impact measures (SlX, estimable, n-k):
##           Direct     Indirect       Total
## HC70  0.03517931   0.02745829   0.0626376
## FP69  0.15178502   0.39408780   0.5458728
## GI69 -0.99749277 -44.73815226 -45.7356450
## ========================================================
## Standard errors:
##           Direct    Indirect       Total
## HC70 0.003266274 0.005698966 0.006251615
## FP69 0.029623416 0.039837611 0.029779251
## GI69 5.741751764 8.670073513 7.413703251
## ========================================================
## Z-values:
##          Direct  Indirect     Total
## HC70 10.7704706  4.818119 10.019427
## FP69  5.1238188  9.892355 18.330643
## GI69 -0.1737262 -5.160066 -6.169069
## 
## p-values:
##      Direct     Indirect   Total     
## HC70 < 2.22e-16 1.4492e-06 < 2.22e-16
## FP69 2.9941e-07 < 2.22e-16 < 2.22e-16
## GI69 0.86208    2.4686e-07 6.8693e-10

#3 - Spatial Lag Model

reg3=lagsarlm(reg.eq1,data= spat.data, listw1)
summary(reg3)
## 
## Call:lagsarlm(formula = reg.eq1, data = spat.data, listw = listw1)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -26.75969  -3.00971  -0.95129   1.88613  67.64144 
## 
## Type: lag 
## Coefficients: (asymptotic standard errors) 
##               Estimate Std. Error z value Pr(>|z|)
## (Intercept)  2.0142358  1.3705152  1.4697   0.1416
## HC70         0.0317994  0.0029064 10.9411   <2e-16
## FP69         0.1940177  0.0194051  9.9983   <2e-16
## GI69        -6.6096360  4.3444017 -1.5214   0.1282
## 
## Rho: 0.51827, LR test value: 606.04, p-value: < 2.22e-16
## Asymptotic standard error: 0.020352
##     z-value: 25.466, p-value: < 2.22e-16
## Wald statistic: 648.49, p-value: < 2.22e-16
## 
## Log likelihood: -9881.772 for lag model
## ML residual variance (sigma squared): 33.562, (sigma: 5.7933)
## Number of observations: 3085 
## Number of parameters estimated: 6 
## AIC: 19776, (AIC for lm: 20380)
## LM test for residual autocorrelation
## test value: 82.747, p-value: < 2.22e-16
impacts(reg2.c,listw=listw1)
## Impact measures (SlX, estimable):
##           Direct     Indirect       Total
## HC70  0.03517931   0.02745829   0.0626376
## FP69  0.15178502   0.39408780   0.5458728
## GI69 -0.99749277 -44.73815226 -45.7356450
summary(impacts(reg2.c,listw=listw1,R=500),zstats=TRUE) #Add zstats,pvals; R=500 not needed for SLX
## Impact measures (SlX, estimable, n-k):
##           Direct     Indirect       Total
## HC70  0.03517931   0.02745829   0.0626376
## FP69  0.15178502   0.39408780   0.5458728
## GI69 -0.99749277 -44.73815226 -45.7356450
## ========================================================
## Standard errors:
##           Direct    Indirect       Total
## HC70 0.003266274 0.005698966 0.006251615
## FP69 0.029623416 0.039837611 0.029779251
## GI69 5.741751764 8.670073513 7.413703251
## ========================================================
## Z-values:
##          Direct  Indirect     Total
## HC70 10.7704706  4.818119 10.019427
## FP69  5.1238188  9.892355 18.330643
## GI69 -0.1737262 -5.160066 -6.169069
## 
## p-values:
##      Direct     Indirect   Total     
## HC70 < 2.22e-16 1.4492e-06 < 2.22e-16
## FP69 2.9941e-07 < 2.22e-16 < 2.22e-16
## GI69 0.86208    2.4686e-07 6.8693e-10

4 - SEM Spatial Error Model

y=XB+u, u=LWu+e

reg4=errorsarlm(reg.eq1,data=spat.data, listw1)
summary(reg4)
## 
## Call:errorsarlm(formula = reg.eq1, data = spat.data, listw = listw1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -23.8804  -3.1737  -1.0609   1.9782  68.8418 
## 
## Type: error 
## Coefficients: (asymptotic standard errors) 
##               Estimate Std. Error z value Pr(>|z|)
## (Intercept)  2.7535487  1.6277867  1.6916  0.09072
## HC70         0.0286870  0.0028887  9.9309  < 2e-16
## FP69         0.2259224  0.0239035  9.4514  < 2e-16
## GI69        -1.1442027  5.0732597 -0.2255  0.82156
## 
## Lambda: 0.53997, LR test value: 547.23, p-value: < 2.22e-16
## Asymptotic standard error: 0.021007
##     z-value: 25.704, p-value: < 2.22e-16
## Wald statistic: 660.71, p-value: < 2.22e-16
## 
## Log likelihood: -9911.177 for error model
## ML residual variance (sigma squared): 34.024, (sigma: 5.833)
## Number of observations: 3085 
## Number of parameters estimated: 6 
## AIC: 19834, (AIC for lm: 20380)

#Spatial Hausman Test

Hausman.test(reg4)
## 
##  Spatial Hausman test (asymptotic)
## 
## data:  NULL
## Hausman test = 89.882, df = 4, p-value < 2.2e-16